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Abstract 

The  purpose  of  this  research  was  to  develop  a  fundamental  framework  for  a  new  approach  to 
multiframe  translational  shift  estimation  in  image  processing.  This  thesis  sought  to  create  a  new 
multiframe  shift  estimator,  to  theoretically  prove  and  experimentally  test  key  properties  of  it,  and  to 
quantify  its  performance  according  to  several  metrics.  The  new  estimator  was  modeled  successfully  and 
was  proven  to  be  an  unbiased  estimator  under  certain  common  image  noise  conditions.  Furthermore  its 
performance  was  shown  to  be  superior  to  the  cross  correlation  shift  estimator,  a  robust  estimator  widely 
used  in  similar  image  processing  cases,  according  to  several  criteria. 

This  research  effort  led  to  the  derivation  of  a  lower  bound  of  estimation  performance  for  the 
multiframe  case.  This  valuable  data  analysis  tool  extends  current  boundary  derivations  to  include  prior 
information  about  the  random  shifting,  thereby  providing  a  more  precise  performance  boundary. 
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MULTIFRAME  SHIFT  ESTIMATION 


I.  Introduction 

All  imaging  systems  mounted  on  moving  platforms  suffer  from  stabilization 
issues.  Due  to  mounting  imperfections  a  given  imaging  system  is  in  a  constant  state  of 
movement.  Such  movement  is  generally  unpredictable  so  it  can  be  modeled  as  a  random 
vector.  This  means  that  the  imaging  system  is  pointing  at  a  slightly  different  scene  for 
each  image  acquisition  period.  These  global  translational  shift  differences  among  data 
frames  can  be  represented  by  horizontal  and  vertical  dimension  value  pairs. 

The  process  of  image  registration  involves  estimating  these  random  shifts  and 
then  removing  them.  Registration  is  an  important  step  for  post  processing  and  analysis 
techniques  such  as  sequential  frame  playback  (video),  frame  averaging,  and 
microscanning.  Video  playback  can  be  a  useful  visual  aid  for  identifying  scene 
information  in  low  light  or  high  noise  situations.  Without  registration  the  playback  can 
appear  jittery  or  shaky,  depending  on  the  degree  of  random  shifts  present.  Registration 
can  reduce  this  effect  and  stabilize  the  objects  within  the  video.  Frame  averaging  is  a 
helpful  tool  for  reducing  noise  corruption.  Photon  noise  is  typically  modeled  with  a 
Poisson  distribution  with  an  average  value  of  the  scene  without  noise.  Averaging  a  set  of 
noisy  frames  together  generally  results  in  a  scene  estimate  with  less  noise  corruption  than 
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any  single  frame.  Without  registration  the  misaligned  frames  average  to  a  blurry  result. 
Adding  the  registration  step  can  properly  line  up  the  frames  before  averaging,  yielding  a 
result  with  more  detail.  Microscanning  is  a  scene  estimation  technique  for  acquiring 
higher  spatial  frequency  content  than  what  the  imaging  system  is  able  to  produce  in  a 
single  frame.  It  is  particularly  useful  in  situations  of  aliasing,  where  the  scene  contains 
frequency  content  that  is  too  high  for  the  imaging  system  to  capture.  Sub-pixel 
registration,  combined  with  interpolation  and  appropriate  frame  combination,  allows 
some  of  the  higher  frequency  content  to  be  extracted  from  the  aliased  data  frames.  Each 
of  these  post  processing  and  analysis  techniques  work  only  as  well  as  the  registration 
algorithm  allows.  Accurate  registration  yields  better  playback,  averaging,  and 
microscanning  results  than  inaccurate  registration  does,  and  registration  performance  is 
directly  dependent  on  the  shift  estimation  that  drives  it. 

Existing  shift  estimation  algorithms  are  based  on  a  2  frame  approach.  One  frame 
is  used  as  a  reference  and  the  other  frame  is  registered  to  it.  The  result  is  a  horizontal  and 
vertical  shift  difference  pair.  This  2  value  vector  is  generally  assigned  to  the  second 
frame  and  the  reference  frame  is  artificially  forced  to  be  the  origin.  Shift  estimates  for 
multiple  frames  are  accomplished  by  repeating  this  process  for  different  secondary 
frames,  while  always  using  the  same  reference  frame  (generally  the  first  frame  in  the  set). 
Since  noise  varies  from  frame  to  frame,  this  method  yields  estimates  that  are  biased  by 
the  noise  of  the  reference  frame.  Robinson  and  Milanfar  discuss  the  Approximate 
Minimum  Average  Square  Difference,  Approximate  Maximum  Direct  Correlator,  Linear 
Gradient-Based  Method,  and  Multiscale  (Pyramid)  Gradient-Based  Method  estimators, 
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all  of  which  estimate  translational  shifts  between  two  frames  (8: 1189-1 190).  Kuglin  and 
Hines,  as  well  as  Gottesfeld  Brown  examine  a  phase  correlation  estimation  technique 
which  uses  Fourier  methods  to  determine  the  translational  offset  between  two  frames 
(4:163-165;  2:345-348).  Frischholz  and  Spinnler  discuss  a  template  matching  approach 
to  estimating  the  shift  difference  between  two  data  frames  (1:50-59).  Even  a  cursory 
survey  of  image  registration  literature  strongly  suggests  that  translational  shift  estimation 
theory  is  saturated  by  2  frame  based  estimation  approaches.  While  this  is  a  reasonable 
starting  point  for  multiframe  estimation  scenarios,  it  can  lead  to  performance  that  is 
highly  image  dependent  (i.e.  if  the  reference  frame  is  unusually  corrupted  then  the 
estimates  might  be  biased  by  the  image).  A  more  robust  approach  would  seek  to 
minimize  such  image  dependence. 

In  this  paper  a  new  multiframe  shift  estimation  method  is  presented.  The 
algorithm  is  capable  of  sub-pixel  estimation  and  is  robust  in  the  presence  of  noise. 
Mathematical  modeling  and  justification  for  these  claims  are  explained,  and  visual  results 
are  shown  for  further  support.  Furthermore,  the  total  mean-square  error  lower 
performance  bound  for  the  multiframe  case  is  derived  and  compared  to  the  performance 
of  this  estimator.  This  tool  offers  justification  for  establishing  points  of  diminishing 
returns  for  the  number  of  frames  used  in  this  shift  estimator. 
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II.  Sensor  Models 


Imaging  System  Model 

All  imaging  systems  consist  of  several  fundamental  pieces.  If  they  are  properly 
mathematically  modeled  and  assembled,  the  entire  system  can  be  simulated.  The  first 
piece  is  the  true  scene,  or  object,  which  represents  what  an  ideal  imaging  system  would 
see  (no  corruption  due  to  noise,  filtering,  etc).  Light  travels  from  the  object  to  the  sensor 
through  the  atmosphere.  Specific  path  conditions  can  vary,  so  an  atmospheric  transfer 
function  is  often  computed.  This  transfer  function  acts  as  a  filter  on  the  light,  attenuating 
content  at  different  visual  frequencies.  The  resulting  light  reaches  the  optical  lens  of  the 
sensor.  The  geometric  shape  of  the  lens  governs  the  optical  transfer  function  ( OTF ), 
which  low  pass  filters  the  visual  frequency  content.  These  two  transfer  functions  are 
often  combined  (by  the  properties  of  convolution)  to  become  a  single  transfer  function 
(6:45). 

After  passing  through  the  sensor  optics  the  filtered  light  falls  upon  the  charge- 
coupled  device  (CCD)  array.  It  is  comprised  of  many  small  detectors  which  each 
generate  an  electrical  charge  proportional  to  the  amount  of  light  on  it.  This  charge  is 
amplified  and  read  out  to  a  more  pennanent  storage  location  where  it  can  be  accessed  for 
post  processing  and  analysis. 
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Since  this  is  a  digital  signal  processing  problem,  it  is  important  to  address  the 
issue  of  proper  sampling  rates.  The  light  that  falls  upon  the  CCD  array  has  been  filtered 
such  that  there  is  a  cut  off  spatial  frequency  fc  above  which  no  content  exists.  According 
to  the  Nyquist  sampling  criteria,  for  all  existing  frequencies  to  be  kept,  a  sampling 
interval  of  at  least  1/(2  f  )  must  be  used.  Visual  frequency  content  is  measured  in  inverse 
distance  terms,  so  the  Nyquist  criteria  actually  dictates  the  necessary  physical  dimensions 
of  the  CCD  array  for  proper  sampling.  More  specifically,  the  Nyquist  criteria  detennines 
the  necessary  center  to  center  spacings  of  the  individual  collection  cells  on  the  CCD 
array.  For  the  remainder  of  this  thesis  it  will  be  assumed  that  the  Nyquist  sampling 
criteria  is  met,  and  therefore  no  aliasing  is  present. 

d  [x,  v )  =  [o  (x,  y )  *  h  ( x ,  y )]  +  n  (x,  y )  (1) 

The  data  generated  by  an  imaging  system  can  be  decomposed  into  several 
components.  Equation  1  shows  a  very  basic  relationship  between  the  true  scene  ofay) 
and  the  collected  data  d(x,y).  In  this  model  d(x,y)  is  based  on  a  photon  counting  process 
for  a  single  frequency,  and  is  therefore  a  matrix  of  positive  integers.  The  total  system 
noise  is  represented  by  n(x,y),  and  the  combined  impulse  responses  of  the  path  and 
optical  system  that  the  light  traveled  through  is  given  by  h(x,y).  The  convolution  of 
o(x,y)  and  h(x,y)  represents  the  spatial  frequency  attenuation  that  occurs  as  a  result  of  the 
atmospheric  transfer  function  and  the  OTF,  and  is  often  denoted  i(x,y)  with  the 
understanding  that  ofay)  is  the  true  scene  without  any  loss  of  spatial  frequency 
information,  or  blurring,  and  i(x,y)  is  the  blurred  version  of  ofay).  It  is  assumed  that  the 


5 


Nyquist  sampling  criteria  is  met  with  regards  to  the  collector  spacing  on  the  CCD  array 
and  the  maximum  spatial  frequency  in  i(x,y).  In  other  words,  the  CCD  array  is  able  to 
fully  capture  all  spatial  frequencies  present  in  i(x,y),  thereby  preventing  any  image 
aliasing.  Equation  2  condenses  equation  1  with  this  abbreviation  and  adds  numerical 
subscript  indices  to  indicate  the  data  frame  index  and  corresponding  noise. 

dl(x,y)  =  i(x,y)  +  nl(x,y)  (2) 

A  translational  shift  offset  of  (a,[:S)  between  two  frames  is  accounted  for  by  including  shift 
values  in  the  appropriate  dimensions,  as  shown  in  equation  3.  It  is  important  to  note  that 
the  noise  is  different  for  each  frame,  as  indicated  by  the  subscript  notation. 

d2(x,y)  =  i(x-a,y-j3)  +  n2(x,y)  (3) 


Minimum  Mean-Square  Error  Estimator 

Having  created  a  basic  model  for  the  data,  it  is  necessary  to  estimate  the  shift 
differences  between  two  data  frames.  The  process  for  finding  a  minimum  mean-square 
error  ( MSE)  estimator  involves  creating  the  mean-square  error  cost  function  and  then 
minimizing  it  with  respect  to  the  offset  between  the  two  frames.  It  is  important  to  note 
that  minimum  MSE  is  referring  to  the  minimization  of  error  between  the  estimation  and 
the  truth,  not  to  be  confused  with  any  other  possible  definitions  of  minimum  MSE  in 
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estimation  theory.  Equation  4  shows  the  mean-square  error  cost  function  associated  with 
a  CCD  array  of  dimensions  MxN,  where  (SX,S})  is  the  offset. 

MN  2 

C  ’5y  )  =  £  Z  (' dl  (*’  y)  -  d2  (Sx  +  X’  5y  +  e))  (4) 

x=l  y= 1 

Expanding  this  cost  function  permits  further  simplification,  as  shown  in  equation  5. 

MN 

c(4A)  =  X  X  [d'  (x>y)  ~  2di  ix^y)  d~  K + x’  5y + y) + dl  +  x,  sy + y )]  (5) 

x=l  y= 1 

The  first  term  can  be  eliminated  from  the  minimization  process  since  it  is  not  dependent 
on  (dx,dy),  the  parameters  used  to  accomplish  the  maximization  of  C.  The  third  tenn  is 
technically  dependent  on  (S x,Sy),  however  it  will  not  change  significantly  if  we  assume 
circular  convolution,  small  shifts,  or  that  new  information  entering  the  frame  from  the 
edges  has  similar  intensity  to  existing  information.  The  middle  term  remains  as  the  only 
expression  significantly  dependent  on  (Sx,Sy).  By  removing  the  negative  sign  and  the 
unnecessary  scalar,  it  can  be  simplified  such  that  the  MMSE  estimator  is  the  expression 
that  maximizes  equation  6. 

M  N 

C2{dx’dy)  =  XXdl(X’y)d2(Sx+X’  5v  +  y)  (6) 

x=\  y= 1 
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By  substituting  equations  2  and  3,  equation  6  can  be  rewritten  as  equation  7. 


M  N 

x=l  y=l 

i  (x,  y)  i(Sx+x-  a,  8y  +y-p) 

+i  (x,  y)  n2  ( 8X  +  x,  Sy  +  y )  (7) 

+i(Sx+x- a,5y  +y-fi) nx  (x, y) 
+nl(x,y)n2(Sx+x,Sy+y)\ 


Assuming  that  the  noise  is  spatially  uncorrelated  and  has  a  zero  mean,  such  as 
Gaussian  noise,  the  expectation  operator  further  simplifies  the  cost  function.  Equations 
8,  9,  and  10  are  a  result  of  this  assumption. 


i(x,  y)  n2  (dx  +  x,  8y  +  y ) J  =  i (x,  y ) E (8X  +  x,  Sy  +  y ) 


=  0 


(8) 


+  x-a,S, 


+  y-  P)rh{x,y)\  =  i($} 


+  x-a,8. 


+  y-fi)E[nl(x,yj]  =0  (9) 


I  (a  y) n2  (8X  +  x, 8y  +  y )]  =  E  [nt  (x, y )]  E  [n2  (dx  +  x, 8y  +  v)]  =  0  (1 0) 


Substituting  equations  8,  9,  and  10  into  equation  7  yields  equation  11. 


M  N 


C2(8x,5y)  =YjYjE  i(x,y)i(8x  +  x-a,8y+y-p) 


x=\  y= 1 


(ID 
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This  is  merely  the  autocorrelation  of  the  image  with  lags  (Sx-a,dv-/3),  and  is  maximized 
when  Sx=a  and  Sv=f:l.  Therefore,  under  the  assumption  that  the  noise  is  uncorrelated  and 
zero  mean,  the  MMSE  shift  estimate  is  the  set  of  parameters  that  maximizes  the  2 
dimensional  cross  correlation  of  the  two  data  frames.  Furthennore  the  expected  value  of 
the  cross  correlation  of  dj(x,y)  and  d2(Sx+x,dy+y)  equals  the  autocorrelation  of  the 
underlying  true  scene  with  lags  (dx-a,Sy-fi),  which  means  that  the  expected  value  of  the 
estimator  is  the  true  shifts. 


Probability  Based  Image  Modeling 

Digital  images  can  be  modeled  as  a  random  process,  where  each  pixel  is  a  random 
variable.  For  a  particular  scene  all  pixels  are  governed  by  the  same  general  probability 
distribution  form,  but  each  pixel  distribution  has  unique  parameters.  These  parameters 
are  a  function  of  the  underlying  noiseless  true  scene.  More  specifically,  for  each  pixel, 
the  mean  of  the  distribution  is  the  value  of  the  corresponding  true  scene  pixel.  Modeling 
an  image  as  a  joint  distribution  function  provides  a  way  to  compute  the  total  probability 
of  collecting  a  particular  data  frame  from  a  true  scene.  Assuming  that  the  pixels  of  an 
image  are  independent  random  variables,  the  probability  of  receiving  a  particular  data 
frame  is  the  product  of  all  the  individual  probabilities  of  the  pixels  within  that  data  frame. 

Two  common  distributions  often  used  in  this  modeling  approach  are  the  Poisson 
distribution  and  the  Gaussian  distribution.  Photon  arrival  is  a  Poisson  random  process. 


9 


Therefore  in  situations  where  most  of  the  noise  is  a  result  of  the  light  collection  process, 
equation  12,  which  defines  the  total  probability  of  receiving  data  frame  D  from  true  scene 
i  with  Poisson  noise  and  random  translational  shifts  of  <a,fi),  is  a  reasonable  distribution 
choice. 


P{d  =  D\a,p)  =  YlY[ 

x  y 


i{x,y)D{x-a'y^]  e^’y) 

D(x-a,y-jB)\ 


(12) 


At  light  levels  where  the  Poisson  distribution  becomes  nearly  symmetric  (i.e.  more  than 
100  photons  per  pixel)  the  shape  of  the  Poisson  distribution  becomes  similar  to  that  of  the 
Gaussian  distribution.  Therefore  the  Gaussian  distribution  is  often  used  to  approximate 
Poisson  noise.  If  light  levels  are  somewhat  constant  throughout  the  image  then  a  single 
variance  parameter  can  be  used  to  approximate  the  individual  pixel  standard  deviations. 
Equation  13  shows  the  total  probability  of  receiving  data  frame  D  from  true  scene  i, 
where  on  is  the  standard  deviation  of  the  noise  and  (a,[i)  are  the  random  translational 
shifts. 


P[d  D\a,P )  r— 

x  y  V 


2a‘, 


\D(x-a,y-p)-i(x,y)f 


(13) 


Since  the  likelihood  of  the  Gaussian  distribution  is  maximized  at  the  mean,  minimizing 
the  MSE  maximizes  the  likelihood  of  the  Gaussian  model.  Therefore  the  estimator  in 
equation  1 1  is  the  maximum  likelihood  estimator  in  the  presence  of  Gaussian  noise. 
Since  the  expected  value  of  the  data  is  the  true  scene,  one  of  the  data  frames  can  be  used 
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as  a  true  scene  estimate  in  the  Gaussian  model,  and  the  other  data  frame  can  be  used  as 
the  data  with  a  shift  relative  to  the  scene  estimate.  Therefore,  since  the  cross  correlator 
estimates  the  true  shifts  (on  average)  for  zero  mean  noise  such  as  this,  it  minimizes  the 
cost  function  in  equation  4  and  is  therefore  an  unbiased  shift  estimator  in  the  presence  of 
Gaussian  noise. 
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III.  Multiframe  Shift  Estimator  Algorithm 


Existing  multiframe  shift  estimation  algorithms  are  effectively  only  a  series  of  2 
frame  shift  estimators.  For  a  set  of  K  data  frames,  a  single  reference  data  frame  is 
selected  and  the  remaining  K- 1  data  frames  are  registered  to  it.  The  performance  of  this 
method  is  highly  dependent  on  the  reference  data  frame  chosen,  and  therefore  can  yield 
poor  results  if  the  reference  data  frame  is  unusually  corrupted  by  noise.  Furthermore, 
since  each  data  frame  is  uniquely  corrupted,  any  reference  data  frame  chosen  will  skew 
performance  by  the  noise  present  in  that  frame.  This  approach  to  multiframe  shift 
estimation  fails  to  use  all  available  information.  A  better  approach  is  one  that  exploits  the 
fact  that  shift  estimates  will  vary  depending  on  which  frame  is  used  as  a  reference  data 
frame. 


Initial  Approach  To  Solution 

In  order  to  compute  meaningful  shift  estimates,  all  estimators  must  choose  an 
origin  to  provide  a  common  reference  for  the  shift  values.  2  frame  based  multiframe 
estimators  force  the  reference  data  frame  to  be  located  at  the  origin  and  yield  estimates 
(for  the  remaining  data  frames)  that  are  relative  to  the  reference  data  frame.  Multiple  sets 
of  shift  estimates,  where  each  set  contains  shift  estimates  for  the  entire  data  set  using  a 
different  reference  data  frame,  can  be  compared  by  aligning  the  sets  to  the  same  reference 
point.  This  can  be  accomplished  by  normalizing  each  set  to  be  zero  mean  in  both  shifting 
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dimensions.  Figure  1  shows  several  examples  of  this  for  10  frames  of  data,  using  the  first 
and  second  frames  as  the  references  with  the  cross  correlation  shift  estimator. 


Multiframe  Estimates: 
Frame  1  Reference 


Multiframe  Estimates: 
Frame  2  Reference 


Muitiframe  Estimates: 

Frame  1  Reference  (Mean  Rerroved) 


Muitiframe  Estimates: 

Frame  2  Reference  (Mean  Rerroved) 


Horizontal  Axis 


Figure  1 .  Examples  of  2  frame  based  multiframe  shift  estimation  using  the  cross 
correlation  shift  estimator  where  K  equals  10.  The  black  diamond  indicates  the  mean  of 

the  scatter. 

If  this  technique  is  repeated  for  the  same  data  set,  using  a  different  reference  data  frame 
each  time,  two  K  by  K  matrices  are  populated  (one  for  the  horizontal  shift  estimates  and 
the  other  for  the  vertical  shift  estimates)  where  K  is  the  total  number  of  data  frames. 
Continuing  from  the  example  in  figure  1,  figure  2  demonstrates  the  resulting  10  sets  of 
shift  estimates  (with  respective  means  removed)  plotted  together.  Here  the  small 
differences  in  shift  estimates  (due  to  using  different  reference  data  frames)  are  visible. 
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Since  each  data  frame  is  uniquely  corrupted  by  noise,  rendering  a  slightly  different  set  of 
relative  shift  estimates  when  used  as  the  reference  data  frame,  the  shift  estimates  form 
clusters  surrounding  the  true  shift  values. 


Horizontal  Dimension 


Figure  2.  Example  of  K  2  frame  based  multiframe  shift  estimations  where  a  different 
reference  data  frame  is  used  for  each  estimation,  and  where  K  equals  10.  Bold  x 
characters  indicate  the  mean  of  each  cluster  of  shift  estimates. 


As  the  number  of  data  frames  used  in  this  technique  increases,  the  clusters  become  more 
densely  populated.  The  new  shift  estimates  are  the  means  of  the  clusters,  and  have  a 
mean  of  zero  in  both  shifting  dimensions. 
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This  new  multiframe  shift  estimator  is  practically  implemented  by  using  an 
unbiased  2  frame  shift  estimator  to  estimate  shift  differences  between  all  combinations  of 


2  data  frames.  As  described  before,  this  can  also  be  thought  of  as  using  a  2  frame  based 
multiframe  shift  estimator  K  times  for  a  data  set  of  K  frames,  using  a  different  reference 
data  frame  each  time.  Each  set  of  shift  estimates  (estimates  for  the  entire  data  set  using  a 
particular  reference  data  frame)  is  nonnalized  to  have  a  mean  at  the  origin.  Graphically 
this  yields  K  clusters  of  K  estimates  each.  The  mean  of  each  cluster  is  stored  as  the  final 
estimate. 


Mathematical  Model 

The  mathematical  model  for  this  new  multiframe  shift  estimator  begins  with  an 
unbiased  2  frame  shift  estimator  capable  of  estimating  vertical  and  horizontal 
translational  shifts  between  the  2  frames.  This  estimator  is  referred  to  as  the  initial 
estimator,  and  is  modeled  as  M(i,j)=  (vertical  estimate, horizontal  estimate)  or 
M i ( ij) = (vertical  estimate)  and  M2(i,j)= (horizontal  estimate).  Inputs  (i,j)  are  the  indices 
of  the  two  data  frames  being  registered,  where  i  is  the  (index  number  of  the)  reference 
data  frame  and  j  is  the  (index  number  of  the)  data  frame  being  registered  to  it.  The 
vertical  and  horizontal  estimate  outputs  are  each  real  numbers  with  a  unit  of  one  pixel 
(i.e.  vertical  shift  =  1.2  indicates  a  vertical  shift  difference  of  1.2  pixels)  and  a  maximum 
absolute  value  of  one  half  the  total  number  of  pixels  per  dimension  (i.e.  for  data  with 
dimensions  256  by  256  a  vertical  or  horizontal  shift  estimate  of  350  is  not  possible,  since 
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a  shift  of  that  size  would  indicate  that  there  is  no  overlapping  content  between  the  two 
data  frames  being  registered).  Output  sign  convention  is  slightly  different  than  the 
standard  2  dimensional  sign  convention,  and  is  described  in  Figure  3. 


Data  Frame  1 


Data  Frame  2 
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Positive  Horizontal  Shift 


Figure  3.  Sign  convention  of  the  new  multiframe  shift  estimator  model. 


The  initial  estimator  performs  shift  estimates  for  all  combinations  of  2  data  frames  in  the 
set  of  K  data  frames.  This  is  efficiently  represented  in  equation  14  where  shift  estimates 
for  each  dimension  populate  a  K  by  K  matrix.  Each  row  represents  the  result  of  a  2 
frame  based  multiframe  shift  estimation. 
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Mx  (l,l)  Mj(  1,2)  •••  MX{\,K) 

Mx  (2,l)  M,(2,2)  Af,(2,/f) 

M^l)  M,(^,2)  •••  MX{K,K) 

M2{  1,1)  M2  (1,2)  •••  M2(l,if) 
M2(2,l)  M2  (2,2)  M2(2,7f) 

M2(^,l)  M2{K,  2)  •••  M2(K,K) 


The  mean  of  each  row  is  subtracted  from  all  members  of  its  row,  as  shown  in  equation 
15. 


Md,fiu)-T2X.(u) 

V  K  1=1  J 

V  K  i= 1  ) 


M^K)~  t1.'') 

v  K  1= 1  y 

v  ^  1=1  y 


(15) 


The  mean  of  each  cluster  is  acquired  by  computing  the  mean  of  each  column  of  equation 
15.  This  results  in  2  vectors  of  length  K,  representing  the  final  shift  estimates  for  all  K 
data  frames,  where  the  mean  of  the  shift  estimates  for  both  dimensions  is  zero,  and  where 
each  vector  contains  the  shift  estimates  for  a  particular  dimension.  Equation  16  shows 
this  step  and  condenses  the  operation  into  a  more  compact  equation. 

Mdimk  ( dataframex dataframeK  )  represents  the  new  multiframe  shift  estimator,  where 

dim  is  the  dimension  of  operation,  k  refers  to  the  data  frame  whose  shift  is  being 
estimated,  and  dataframei,..., dataframeK  are  the  estimator  inputs. 
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Mdim k  ( dataframex ,...,dataframeK ) 

^  K  i= 1  V  K  7= 1 

=7TS[(/f-Ma..(i^))-EMdto(o') 

A  «=i  v  7=1  y 


J_ 

~K^ 


7=1 


7=1 


7=1 

v  1=1  7  v 1=1  7=i  > 


(16) 


Further  simplifications  can  be  made  an  important  property  is  true  of  the  initial 
estimator.  This  property  (equation  17)  states  that,  for  shift  difference  estimation  between 
2  data  frames,  switching  the  reference  data  frame  results  only  in  a  sign  change  of  the 
estimate.  If  this  property  is  true  then  the  matrices  (populated  by  the  initial  estimator)  in 
equation  14  gain  the  property  such  that  each  is  equal  to  the  negative  of  its  own  transpose. 
If  equation  17  is  true  then  equation  18  is  also  true.  It  states  that  the  shift  estimate  of  a 
data  frame  with  itself  yields  zero  in  both  dimensions.  This  means  that  the  diagonals  of 
the  matrices  in  equation  14  equal  zero. 


M(i,j)  =  -M(j,i) 


(17) 
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M(i,i)  =  0 


(18) 


Finally,  if  equation  17  is  true  then  a  corollary  property  becomes  true.  Equation  19  states 
that  the  sum  of  all  values  in  each  matrix  is  zero. 


K  K 

EZMdim(h7')  =  0 


i=l  7=1 


(19) 


Substituting  this  equality  into  equation  16  simplifies  the  expression  for  the  final  shift 
estimates  to  be  as  described  in  equation  20. 


Mdim k  ( dataframe x dataframeK ) 
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K 

J 
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J_ 
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f  K  \  (  K  K  h 

K'LM^Uk)  - 

V  1=1  J  V 1=1  J=1  J 


V  1=1 

XMdim  (i’k) 


-0 


i= 1 


(20) 


This  means  the  initial  shift  estimates  do  not  need  to  be  nonnalized  prior  to  averaging  the 
columns. 
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This  new  multiframe  shift  estimator  is  independent  of  any  underlying  random 
shifting  model.  If  prior  information  is  known  about  the  type  of  random  shifting 
occurring,  it  can  be  factored  into  the  choice  of  initial  estimator  for  this  new  multiframe 
shift  estimator. 


Speed  Of  Implementation 

Implementation  of  the  new  multiframe  shift  estimator  involves  selecting  an 
unbiased  2  frame  initial  shift  estimator,  computing  the  shift  differences  between  all 
combinations  of  2  data  frames,  and  averaging  all  results  with  respect  to  dimensionality 
and  frame  index.  If  equation  17  holds  then  significant  speed  improvements  can  be  made. 
It  has  been  shown  that  the  cross  correlation  operation  is  an  unbiased  2  frame  shift 
estimator  and  it  satisfies  equation  17.  Based  on  its  merits,  the  cross  correlation  shift 
estimator  is  an  excellent  candidate  for  the  initial  estimator. 

Since  equation  17  is  true  of  the  cross  correlator,  implementation  becomes  much 
faster.  Rather  than  performing  K“  estimations  with  the  initial  estimator,  only  the  upper 
triangular  portion  of  the  two  initial  shift  estimate  matrices  must  be  filled.  The  symmetry 
in  the  matrices  reduces  the  number  of  estimations  from  K“  to  (K  -K)/2  because  the 
diagonal  of  each  matrix  equals  zero,  and  because  only  the  upper  (or  lower)  triangle  of  the 
matrices  needs  to  be  populated  since  the  opposing  triangle  equals  its  negative  transpose. 
After  populating  the  initial  shift  estimate  matrices,  final  estimates  are  computed  by 
averaging  the  columns. 
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Unbiased  Property 


For  this  thesis  bias  is  defined  as  the  mean-square  error  (MSE)  of  the  estimator 
without  noise.  This  value  is  computed  by  centering  the  mean  of  the  shift  estimates  on  the 
mean  of  the  true  shift  values,  calculating  the  distance  squared  between  each  true  shift 
value  and  it’s  corresponding  shift  estimate  (for  each  dimension),  summing  these  squared 
distances,  and  dividing  the  sums  (one  for  each  dimension)  by  the  number  of  data  frames 
used.  Equation  21  models  this  metric  as  MSEihm  K  where  K  is  the  number  of  data  frames 

used,  and  where  trues hiftdim  (  is  comprised  of  2  vectors  of  length  K  containing  the  true 
shift  values  for  each  dimension  and  data  frame. 
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(21) 


Equation  20  shows  that  the  multiframe  estimate  is  an  average  of  unbiased 
estimates  of  the  shifts.  If  the  initial  estimator  is  unbiased  then  the  new  multiframe  shift 
estimator  is  also  unbiased.  The  cross  correlation  estimator  is  unbiased.  Therefore,  if  the 
cross  correlation  estimator  is  used  as  the  initial  estimator,  the  new  multiframe  estimator  is 
unbiased.  Furthermore,  as  K  gets  larger,  the  MSE  of  the  estimates  compared  to  the  true 


21 


shifts  approaches  zero.  This  property  can  be  demonstrated  experimentally  by  simulating 
data,  estimating  the  random  shifts,  comparing  the  truth  to  the  estimates  to  acquire  MSE 
information  for  each  experiment,  and  averaging  the  MSE  information  over  all  the 
experiments  to  determine  if  any  bias  is  present  in  the  estimator. 

The  first  step  in  simulating  data  is  to  select  a  true  scene.  This  matrix  of  positive 
real  integers  is  the  uncorrupted  Nyquist  sampled  digital  representation  of  a  scene.  It  is 
then  filtered  by  the  average  atmospheric  transfer  function  and  optical  transfer  function 
using  fast  Fourier  techniques.  The  result  is  randomly  shifted  according  to  the  2 
dimensional  Gaussian  distribution  with  a  mean  at  the  origin  and  with  variances  of  2 
pixels  for  both  dimensions.  Fast  Fourier  techniques  are  used  to  perform  circular  shifts 
and  the  result  is  cropped  to  smaller  dimensions  to  eliminate  unrealistic  border  effects. 
For  this  set  of  experiments  a  simulated  point  source  is  used  because  it  has  high  contrast 
(higher  contrast  tends  to  yield  better  shift  estimates).  The  true  scene  has  dimensions  of 
64  pixels  by  64  pixels  and  the  final  simulated  data  frame  has  dimensions  of  32  pixels  by 
32  pixels.  Figure  4  shows  the  true  scene  cropped  to  32  pixels  by  32  pixels  but  without 
any  corruption.  Using  this  method  of  simulation,  500  unique  sets  of  data  were  generated 
for  each  K  between  3  and  20,  totaling  103,500  uniquely  shifted  data  frames.  The  new 
multiframe  shift  estimator,  using  the  cross  correlation  initial  estimator,  generated  shift 
estimates  for  each  of  the  500  data  sets  per  K.  The  estimated  shifts  were  compared  to  the 
true  shift  values  to  compute  the  MSE  metric,  and  the  500  MSE  metrics  per  K  were 
averaged.  Figure  5  shows  the  average  MSE  (per  dimension)  as  a  function  of  the  number 
of  data  frames  used. 
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True  Scene:  Simulated  Point  Source 
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Figure  4.  Simulated  point  source  used  to  generate  data  and  experimentally  measure  any 

bias  in  new  multiframe  shift  estimator. 

It  also  shows  the  best  fitting  curve  of  the  form  MSE(K)  -a*  e~b*K  +  c  *  e~d*K  for  each  data 
set  to  demonstrate  convergence  toward  zero.  This  functional  form  is  used  because  it 
closely  follows  the  trend  of  the  data  and  is  a  simple  form  to  mathematically  manipulate. 
As  the  independent  variable  increases  the  MSE  converges  to  zero.  Therefore,  with  no 
noise  present,  the  estimator  is  unbiased  as  K  grows  large. 
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Bias  (MSE  of  estimator  in  dimension  1)  Modeling  for  Bias  vs  Data  Frames 

Dimension  1 


Dimension  2 


Figure  5.  Simulated  point  source  used  to  generate  data  and  experimentally  measure  any 

bias  in  new  multiframe  shift  estimator. 
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IV.  Comparison  With  Existing  Algorithms 


Lower  Bound  Of  Performance 

A  well  documented  and  broadly  respected  performance  limit  in  the  field  of 
estimation  theory  is  the  total  mean-square  error  of  the  estimator.  This  is  the  lowest 
possible  total  mean-square  error  that  any  unbiased  estimator  can  achieve  for  a  given 
distribution  model.  It  can  be  used  to  show  that  the  lower  bound  on  shift  error  improves 
with  adding  frames,  and  it  can  infer  the  relative  performance  improvement  obtainable. 
Van  Trees  derives  it  by  modifying  the  nonrandom  parameter  boundary  derivation  such 
that  prior  infonnation  is  accounted  for.  Equation  22  shows  the  total  information  matrix 
Jt  where  Jd  is  the  information  matrix  based  on  the  data  and  derived  in  the  nonrandom 
parameter  case,  and  Jp  is  the  prior  information.  Jd  is  also  known  as  Fisher ’s  information 
matrix  ( FIM )  which  is  used  to  compute  the  Cramer  Rao  lower  bound  (CRLB)  of 
variance. 


J"  j. 


J  D  J  P 


(22) 


Van  Trees  defines  the  elements  of  Jp  as 


J„ 


d2  In  Pa  (A) 

dAfiAj 


(23) 
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and  states  that  the  “diagonal  elements  in  the  inverse  of  the  total  information  matrix  are 
lower  bounds  on  the  corresponding  mean-square  errors”  (9:84)  as  shown  in  equation  24, 
where  J“r  is  the  ith  diagonal  element  of  the  inverse  of  J  r. 


£[<]>4  (24) 

In  the  multiframe  shift  estimation  problem  the  lower  perfonnance  bound  can  be 
determined  as  a  function  of  the  number  of  data  frames  used,  the  parameters  of  the  random 
shifting  model,  and  the  gradient  infonnation  of  the  true  scene.  Since  there  are  two 
dimensions  of  random  translational  movement  it  is  necessary  to  use  the  multiple 
parameter  model  of  JD.  The  elements  of  JD  are  defined  in  equation  25  (9:79). 


JDii  =  ~E 
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It  is  assumed  that  any  necessary  conditions  for  equation  24  to  be  true  are  met  (9:79-80). 

The  random  translational  shifts  of  a  set  of  data  can  be  modeled  as  the  joint 
likelihood  function  P(a,/3\D)  where  a  and  [[  represent  the  horizontal  and  vertical  random 
translational  shifts,  respectively,  and  D  is  the  true  scene.  This  expression  can  be 
rewritten  by  applying  Bayes  theorem,  as  shown  in  equation  26. 
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p(clP\d)  = 
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The  natural  log  operator  further  simplifies  this  function,  as  shown  in  equation  27. 
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In  the  first  term  the  translational  shifts  are  not  treated  as  random  variables  but  rather 
parameters  of  the  true  scene,  which  is  modeled  as  a  random  matrix.  More  specifically, 
the  first  term  models  the  true  scene  as  the  average  value  of  random  noise.  The  FIM  is 
computed  from  this  tenn,  which  means  that  the  lower  bound  of  the  variance  is  image 
dependent.  The  Poisson  distribution  becomes  nearly  symmetric  when  its  parameter 
becomes  large,  and  begins  to  take  on  a  shape  similar  to  the  Gaussian  distribution.  If  there 
is  sufficient  light  in  a  particular  pixel  (i.e.  more  than  100  photons)  then  the  Gaussian 
distribution  can  be  used  to  approximate  the  noise.  Therefore  the  likelihood  function  of 
the  first  term  is  modeled  as  follows: 


P(D\a,l)  =  U  rJ~  exP 
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on  is  the  standard  deviation  of  the  noise,  A  is  data  frame  k,  and  f(m-(/k,n-fik)  is  the  true 
scene  shifted  by  the  true  shift  values  of  data  frame  k.  The  log  likelihood  expression  is 


ln(p(£>  | «,/?)) 

=  -K  In  (  VAct;i  )  -  — -y  ^ 


Z(  A  n)-f(m-cck,n  -fik  )f 


(29) 


where  the  first  term  is  just  a  constant.  The  partial  derivative  operator  is  also  known  as  the 
gradient,  and  must  be  broken  into  3  steps  in  order  to  fully  account  for  all  combinations  of 
the  shift  parameters.  In  all  three  cases,  the  constant  term  disappears  and  the  gradient  of 
the  second  tenn  is  all  that  remains,  resulting  in  the  following  three  equations: 
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For  conciseness  of  notation  the  substitutions  /  =  f{m-  ak,n-  pk )  and 
Dt.  =  Dk(m,n)  are  made.  Since  the  shift  values  are  treated  as  parameters,  and  since  they 

are  different  for  each  data  frame,  the  gradient  operations  are  expressed  in  a  more  general 
form.  The  expression  greatly  simplifies  when  the  expectation  operator  is  applied  to  the 
log  likelihood  function.  This  is  because  the  only  random  element  of  the  function  is  Da, 
which  has  a  mean  of  the  true  scene.  Therefore  the  expected  value  of  (d,  -  /)  is  zero, 
leaving  only  the  gradient  information  of  the  true  scene. 
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This  information  is  more  clearly  and  completely  represented  in  matrix  form  in 


equation  35  where  the  substitutions 


are  made  for  more  notation 


brevity.  In  this  form  all  possible  combinations  of  two  gradients  are  accounted  for  (i.e. 
gradient  of  true  scene  with  respect  to  a,  shift  times  gradient  of  true  scene  with  respect  to 
Pi  shift). 
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The  FIM  for  Gaussian  noise  is  originally  derived  by  Robinson  and  Milanfar  (8:1186- 
1187). 

The  second  term  in  equation  27  accounts  for  any  prior  knowledge  of  the 
translational  shifts.  Unlike  Jd,  here  the  translational  shifts  are  treated  as  random  vectors 
which  can  be  statistically  modeled.  For  this  derivation  a  Gaussian  distribution  is  used  to 
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describe  the  translational  shifts.  Equation  36  expresses  the  random  translational  shifts  in 


a  general  2  dimensional  Gaussian  form. 
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This  model  assumes  that  the  shifts  occur  with  a  mean  of  zero.  Furthermore,  this  model 
allows  for  different  variances  to  govern  each  dimension  of  shifting.  The  natural  log 
operator  simplifies  the  expression  to  become  the  sum  of  2K+\  terms  where  all  but  one 
contain  a  random  variable,  as  shown  in  equation  37. 
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The  partial  derivative  operators  distribute  to  each  term,  immediately  eliminating  the 
constant  tenn.  Once  again  the  results  are  dependent  on  the  reference  of  the  derivatives. 
Equations  38-42  show  the  results  of  all  possible  combinations  of  derivative  operators. 
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Many  results  are  conveniently  canceled  out,  leaving  only  2 K  non-zero  results.  These 
terms  pass  through  the  expectation  operator  and  finally  result  in  the  following  two  very 
simple  expressions: 
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Using  the  same  matrix  notation  form  as  before,  Jp  is 
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The  third  term  in  equation  27  is  the  easiest  to  solve  of  the  three.  Since  the  data 
does  not  contain  shift  parameters,  the  partial  derivative  operators  with  respect  to  the  shift 
parameters  cancel  out  this  term  completely. 

Given  that  image  noise  is  Gaussian  with  a  mean  of  the  true  scene,  and  given  that 
the  random  translational  shifts  occur  according  to  a  Gaussian  model  with  a  mean  of  zero, 
the  final  expression  for  the  total  mean-square  error  of  the  multiframe  shift  estimation 
scenario  is 
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It  can  be  shown  that  fa  =  fx  and  fp  =  fy  for  all  In  other  words,  since  all  a  values  are 

in  the  same  dimension,  taking  the  gradient  of  an  image  with  respect  to  any  a  value  is  the 
same  as  taking  a  gradient  with  respect  to  the  dimension  of  a.  The  same  is  also  true  of  [[. 
This  property  simplifies  the  final  result  to  be  as  shown  in  equation  47,  where  the  diagonal 
elements  of  the  inverse  of  matrix  Jj  are  the  estimation  performance  bounds  for  all 
random  shifts.  This  performance  boundary  is  estimator  independent  and  therefore 
represents  the  best  possible  performance  that  any  estimator  can  achieve. 
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If  the  variance  of  the  shifts  is  large  then  the  prior  information  effectively  cancels 
out,  leaving  JT  equal  to  the  FIM,  which  yields  the  Cramer  Rao  lower  bound  (CRLB)  of 
variance.  Flowever  since  the  FIM  is  based  only  on  the  data,  it  is  a  lower  bound  than  the 
actual  performance  bound  (which  could  only  be  computed  if  all  random  elements  of  the 
imaging  process  were  included).  In  the  case  of  the  new  multiframe  shift  estimator  the 
CRLB  is  useful  since  the  estimator  does  not  assume  a  particular  (non-uniform)  shifting 
model.  This  means  that  the  estimator  assumes  the  shifting  variance  is  infinite  (or  rather 
that  the  shifting  distribution  is  uniform). 

Estimator  Variance 

Equation  47  provides  a  reference  for  estimator  variance  comparison. 
Experimentally  acquiring  the  variance  of  the  estimator  involves  selecting  a  true  scene 
with  gradient  infonnation  that  is  greater  than  zero,  and  selecting  an  imaging  system 
model  that  samples  the  scene  well  enough  to  prevent  aliasing.  In  order  to  observe 
differences  in  the  estimator  results  it  is  necessary  to  vary  the  noise  of  each  frame  in  a  set 
of  data,  without  changing  the  random  shifts  of  each  frame.  Figure  6  shows  the  results  of 
this  approach  where  a  set  of  K  data  frames  are  uniquely  corrupted  by  Gaussian  noise 
(with  a  standard  deviation  of  3.5  photons)  500  times  and  the  random  translational  shifts 
are  estimated  each  time  by  the  new  multiframe  shift  estimator.  The  same  true  scene 
(figure  4)  and  imaging  system  simulator  as  the  estimator  bias  experiment  is  used  for  this 
experiment.  The  average  variance  of  the  500  shift  estimates  per  data  frame  is  retained, 
and  those  K  values  are  averaged  to  become  the  empirically  discovered  estimator  variance 
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for  K  data  frames.  K  is  varied  from  3  to  20  data  frames  with  the  expectation  that  the 
variance  of  the  estimator  will  decrease  as  the  number  of  data  frames  used  in  the 
estimation  increases.  Finally,  since  only  one  set  of  random  shifts  is  used  in  this 
experiment,  and  since  the  initial  estimator  performance  can  be  influenced  by  individual 
sets  of  data,  this  entire  experiment  is  run  50  times,  changing  the  random  translational 
shifts  each  time. 
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Modeling  for  Performance  Bound  vs  Data  Frames 
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Figure  6.  Experimentally  detennined  variance  of  the  new  multiframe  shift  estimator, 
compared  to  the  total  mean-square  error  lower  bound  of  infinite  shifting  variance,  for  3  to 
20  data  frames,  for  estimation  of  the  horizontal  shift  on  frame  1  of  the  data  set.  This 
bound  is  effectively  the  CRLB  of  estimation  variance. 
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In  figure  6  both  the  estimator  performance  and  the  bound  are  modeled  by  a  double 


exponential  functional  form,  which  retains  the  general  trend  of  the  data,  and  allows  for 
simple  analysis  since  both  are  of  the  same  form.  The  gap  between  the  performance  and 
the  bound  (in  figure  6)  is  a  combined  result  of  the  type  of  image  used,  the  failure  of  the 
cross  correlation  initial  estimator  to  account  for  any  prior  shifting  information,  and  the 
fact  that  the  bound  does  not  include  scene  estimation  in  its  model. 


Analysis  Of  Results 

This  perfonnance  bound  offers  a  useful  tool  for  finding  points  of  diminishing 
returns  of  perfonnance  from  adding  frames.  For  a  given  type  of  data  this  performance 
bound  is  computed  and  fit  with  a  curve  model.  The  percent  of  improvement  due  to 
adding  additional  frames,  relative  to  perfonnance  at  a  particular  starting  K  value,  can  be 
quickly  computed  and  offers  the  analyst  practical  insight  into  the  potential  improvement 
that  can  be  gained  by  using  more  data  frames  in  the  estimation.  This  same  technique  can 
be  applied  to  the  estimator  performance.  Figure  7  shows  several  examples  of  this  tool, 
using  the  estimator  performance  and  perfonnance  bound  shown  in  figure  6.  Figure  8 
shows  the  cumulative  version  of  figure  7,  where  the  improvement  approaches  100%  as  K 
grows  large. 
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Performance  Compared  to  Bound: 
Percent  Improvement  Relative  To  K=3 


K  (#  of  data  frames) 
Performance  Compared  to  Bound: 
Percent  Improvement  Relative  To  K=20 


Performance  Compared  to  Bound: 
Percent  Improvement  Relative  To  K=10 


K  (#  of  data  frames) 
Performance  Compared  to  Bound: 
Percent  Improvement  Relative  To  K=50 


Figure  7.  Percent  improvement  of  the  estimator  performance  and  performance  bound  due 
to  using  more  data  frames  in  the  estimation.  This  is  based  on  the  estimator  performance 
and  performance  bound  in  figure  6.  For  instance,  adding  10  data  frames  when  starting 
with  3  will  improve  the  potential  performance  bound  much  more  significantly  than 
adding  10  frames  when  starting  with  50. 
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Performance  Compared  to  Bound: 
Percent  Improvement  Relative  To  K=3 


Performance  Compared  to  Bound: 
Percent  Improvement  Relative  To  K=10 


Performance  Compared  to  Bound: 
Percent  Improvement  Relative  To  K=20 


K  (#  of  data  frames) 


Performance  Compared  to  Bound: 
Percent  Improvement  Relative  To  K=50 


Figure  8.  Cumulative  percent  improvement  of  the  estimator  performance  and 
performance  bound  due  to  using  more  data  frames  in  the  estimation.  This  is  based  on  the 
estimator  performance  and  performance  bound  in  figure  6,  and  is  the  cumulative  result  of 

the  curves  in  figure  7. 


Point  sources  have  higher  gradients  than  any  other  realistic  images  of  the  same 
dimensions  and  amount  of  light.  According  to  equation  47  this  means  that  point  source 
images  yield  the  lowest  possible  performance  bounds.  Since  figure  4  is  a  point  source, 
then  the  performance  bound  in  figure  6  is  the  lowest  possible  bound  for  32  pixel  by  32 
pixel  images  with  the  same  total  amount  of  light,  making  this  bound  image  independent. 


In  realistic  situations  the  true  scene  is  not  known.  Therefore  it  is  necessary  to  use 
an  estimate  of  the  true  scene  to  compute  the  performance  bound  for  a  particular  set  of 
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data.  Common  practice  is  to  use  either  the  frame  average  of  a  registered  set  of  data  or  an 
individual  data  frame  depending  on  the  severity  of  noise  corruption. 


Estimator  Comparison  Based  On  Likelihood  Of  Scene  Estimate 

It  has  been  shown  that  a  data  frame  can  be  modeled  as  a  joint  random  function, 
where  the  total  likelihood  of  receiving  a  particular  data  frame  is  the  product  of  all  the 
individual  probabilities  of  the  pixels  for  the  amount  of  light  received,  respectively. 
Equation  12  models  a  particular  data  frame  according  to  the  Poisson  distribution. 
Equation  48  shows  how  the  natural  log  operator  simplifies  this  expression  without 
altering  its  monotonicity. 


]n(p(D\a,j3)) 

x  y 

The  joint  probability  modeling  approach  provides  a  common  frame  of  reference 
to  compare  scene  estimates  obtained  from  a  single  data  set.  It  is  therefore  reasonable  to 
argue  superiority  of  one  scene  estimate  over  another  on  the  grounds  of  higher  likelihood 
of  occurrence.  For  example,  for  two  different  scene  estimations  generated  from  the  same 
data  set,  the  scene  estimation  with  a  higher  likelihood  is  the  more  probable  estimate. 
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Since  frame  registration  is  a  crucial  part  in  scene  estimation,  and  since 
translational  shift  estimation  is  the  fundamental  element  of  registration,  this  modeling 
approach  provides  a  way  to  compare  shift  estimation  performance  of  different  shift 
estimators  using  real  data.  Frame  averaging  is  a  simple  and  effective  scene  estimation 
method.  It  involves  translational  shift  estimation  and  removal  (registration)  followed  by 
averaging  across  the  registered  data  set.  Different  translational  shift  estimators  will 
ultimately  yield  different  scene  estimates.  Therefore,  multiple  shift  estimators  can  be 
compared  by  using  the  frame  averaging  technique.  If  one  shift  estimator  ultimately 
yields  a  scene  estimate  with  a  higher  likelihood  than  another,  then  that  shift  estimator 
performed  better. 

Figure  9  shows  the  average  results  of  500  such  comparisons  for  3  to  30  frames. 
For  each  comparison  and  each  number  of  frames  a  unique  set  of  data  is  created  with 
Poisson  noise  and  Gaussian  random  shifting  (variance  of  2  pixels  in  both  dimensions). 
The  data  set  is  registered  using  the  new  multiframe  shift  estimator  (with  cross  correlation 
initial  estimator)  and  the  cross  correlation  multiframe  shift  estimator.  The  registered  data 
sets  are  then  averaged  across  the  number  of  frames  to  estimate  the  true  scene.  Finally, 
the  scene  estimates  are  shifted  such  that  they  are  centered  on  the  true  scene  in  order  to 
maximize  the  likelihood  functions  that  will  be  generated  and  to  ensure  that  neither  scene 
estimate  is  unfairly  biased.  The  true  scene  is  the  image  with  the  highest  likelihood  of 
occurrence,  therefore  providing  an  upper  bound  of  likelihood.  In  this  experiment  the 
simulated  32  pixel  by  32  pixel  point  source  image  of  figure  4  is  used. 
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Average  Log  Likelihood  Of  Frame  Averaging  Scene  Estimate 
Based  On  Poisson  Model  (500  Experiments  Per  Frame) 
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Figure  9.  Average  experimental  log  likelihood  of  the  frame  averaged  scene  estimate  as  a 

function  of  the  number  of  frames. 

It  is  plainly  visible  from  this  experiment  that  the  log  likelihood  of  the  scene 
estimation  resulting  from  the  new  multiframe  shift  estimator  is  substantially  better  than 
that  of  the  cross  correlation  multiframe  shift  estimator.  Furthermore,  it  is  apparent  that  as 
the  number  of  frames  increases,  the  log  likelihood  of  the  scene  estimation  resulting  from 
the  new  multiframe  shift  estimator  increases,  eventually  converging  to  some  value  less 
than  or  equal  to  the  log  likelihood  of  the  true  scene. 
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Estimator  Comparison  Based  On  Likelihood  Of  Data  Set 


If  the  true  scene  is  not  known,  such  as  in  a  realistic  data  collection  situation,  then 
a  slightly  different  comparison  method  must  be  used.  Just  as  an  individual  image  can  be 
modeled  as  a  joint  likelihood  function,  a  set  of  data  frames  can  also  be  modeled  as  a  joint 
likelihood  function.  Assuming  that  all  data  frames  within  a  given  set  are  independent  of 
one  another,  and  using  the  model  already  established  for  a  single  image  with  Poisson 
noise,  the  total  probability  of  receiving  a  particular  data  set  is 
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This  model  is  very  similar  to  equation  12,  except  that  it  computes  the  product  of  all  the 
probabilities  of  each  data  frame  occurring.  The  log  likelihood  version  of  this  model  is 
very  similar  to  equation  48,  except  that  it  adds  all  the  individual  log  likelihoods  of  each 
data  frame,  as  shown  in  equation  50. 
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For  a  particular  shift  estimator,  the  average  of  the  registered  frames  is  used  as  the 
true  scene.  The  log  likelihood  of  the  registered  data  set  occurring  is  computed  as  a  final 
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performance  metric,  representative  of  how  well  the  shift  estimator  performs.  Accurate 
shift  estimates  yield  higher  individual  data  frame  likelihoods.  Therefore  higher  overall 
data  set  likelihoods  indicate  better  shift  estimation.  Figure  10  shows  experimental  results 
comparing  the  new  multiframe  shift  estimator  and  the  cross  correlation  multiframe  shift 
estimator.  The  same  data  sets  from  the  previous  experiment  are  used.  Since  this  is  an 
experiment  and  the  true  scene  is  actually  known,  an  upper  performance  bound  can  be 
calculated.  For  a  data  set  of  K  frames  the  highest  possible  metric  that  can  be  acquired  K 
times  the  log  likelihood  of  the  known  true  scene.  This  boundary  is  shown  in  figure  10. 
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Average  Sum  Of  Data  Frame  Log  Likelihoods 
Based  On  Poisson  Model  (500  Experiments  Per  Frame) 


Figure  10.  Average  log  likelihood  of  a  particular  data  set,  using  the  frame  averaged 
scene  estimate  as  the  true  scene,  as  a  function  of  number  of  frames. 
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It  is  clear  in  figure  10  that  the  new  multiframe  shift  estimator  (using  cross  correlation 
initial  estimator)  performs  significantly  better  than  the  cross  correlation  multiframe  shift 
estimator. 

Figure  1 1  illustrates  the  log  likelihood  results  of  a  real  data  set  of  black  and  white 
photographs  of  a  coin,  located  12  inches  from  a  20  watt  halogen  lamp  light  source.  This 
data  is  from  a  handheld  digital  camera,  and  the  random  shifting  (resulting  from  ordinary 
hand  instabilities)  occurs  about  the  center  of  the  coin.  An  example  data  frame  is  shown 
in  figure  12.  For  this  experiment  data  sets  ranging  in  size  from  3  to  50  frames  are 
collected.  For  each  collection  the  new  multiframe  shift  estimator  (with  cross  correlation 
initial  estimator)  and  the  cross  correlation  multiframe  shift  estimator  are  used  to  estimate 
all  translational  shifts.  The  total  data  set  log  likelihood  for  each  collection  is  calculated 
(using  frame  averaging  of  the  respective  registered  data  sets  as  the  scene  estimate). 
Figure  1 1  clearly  shows  that,  for  this  data  set,  the  new  multiframe  shift  estimator 
performs  better  than  the  cross  correlation  multiframe  shift  estimator.  Furthermore,  it 
shows  that  as  the  number  of  data  frames  is  increased,  the  the  new  multiframe  shift 
estimator  performs  increasingly  better  compared  to  the  cross  correlation  multiframe  shift 
estimator. 
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Comparison  Of  Data  Set 
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Figure  11.  Comparison  of  data  set  log  likelihoods  for  3  to  50  frames  of  real  data.  The 
right  graph  shows  the  log  likelihood  difference  between  the  two  estimators,  such  that  a 
higher  positive  difference  indicates  a  higher  likelihood  of  the  new  multiframe  shift 

estimator. 
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Sample  Data  Frame  Of  Real  Data  Set 
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Figure  12.  Sample  data  frame  from  real  data  set.  The  data  set  contains  black  and  white 
photographs  of  a  nickel  on  a  dark  gray  background.  The  20  Watt  halogen  lamp  light 
source  was  located  12  inches  from  the  coin,  and  the  camera  was  located  approximately 

24  inches  from  the  coin. 
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V.  Conclusion 


Random  translational  shifting  is  a  common  problem  in  data  collected  from 
imaging  systems  mounted  on  a  moving  or  vibrating  platform.  This  shifting  degrades  the 
quality  of  information  in  the  data,  and  complicates  the  scene  estimation  process. 

However,  much  of  this  corruption  can  be  removed  if  reliable  and  accurate  translational 
shift  estimates  are  obtained. 

It  has  been  shown  that  there  exist  estimators  able  to  determine  the  translational 
shift  difference  between  2  frames  of  the  same  data  set.  More  specifically,  it  has  been 
shown  that  the  maximum  likelihood  estimator  in  the  presence  of  Gaussian  noise  is  the 
cross  correlation  shift  estimator.  While  there  exists  a  broad  variety  of  2  frame  shift 
estimation  techniques,  multiframe  shift  estimation  theory  is  relatively  undeveloped.  Most 
current  multiframe  shift  estimation  methods  are  merely  a  series  of  2  frame  based  shift 
estimations.  This  approach  can  suffer  from  skewed  performance  due  to  its  image 
dependence,  thereby  motivating  the  development  of  a  multiframe  shift  estimation 
approach  that  overcomes  any  image  dependence. 

The  new  multiframe  shift  estimator  presented  in  this  report  uses  existing  2  frame 
based  estimators  to  determine  shift  differences  among  a  set  of  data  frames,  and  applies 
prior  knowledge  of  the  shifting  to  remove  any  performance  skewing  due  to  particular 
image  dependencies.  This  results  in  estimates  that  are  more  accurate  than  the 
corresponding  2  frame  based  estimates  of  the  shifts.  This  new  approach  to  multiframe 
shift  estimation  effectively  uses  more  of  the  information  in  the  data  than  the  2  frame 
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approach.  As  a  result  it  performs  better  as  more  data  frames  are  included  in  the 
estimation  process.  Furthermore,  this  new  estimator  does  not  add  any  bias  to  the  initial 
estimator  that  drives  it.  Therefore,  if  the  initial  estimator  is  unbiased  (meaning  that  the 
expected  value  of  the  mean-square  error  is  zero)  then  the  corresponding  new  multiframe 
estimator  is  also  unbiased. 

It  has  been  shown  that  images  and  data  sets  can  be  modeled  as  joint  random 
functions  where  the  true  scene  is  the  mean  of  the  noise.  This  modeling  approach  allows 
performance  of  different  shift  estimators  to  be  compared  on  the  basis  of  total  likelihood. 
In  all  experiments  performed,  the  new  shift  estimator  yielded  results  with  a  higher 
likelihood  than  the  cross  correlation  estimator. 

The  lower  bound  of  performance  is  comprised  of  several  parts.  The  first  part  is  a 
result  of  the  data,  where  the  translational  shifts  are  treated  as  non-random  parameters. 

The  boundary  that  results  from  this  part  is  the  Cramer  Rao  lower  bound  of  variance,  and 
has  been  derived  for  the  2  frame  estimation  case  by  Robinson  and  Milanfar.  The  second 
part  of  the  performance  bound  accounts  for  the  translational  shifts  as  random  parameters, 
and  utilizes  any  prior  information  about  the  shifting.  The  final  result  is  a  lower  bound  on 
the  total  mean-square  error  of  a  particular  true  scene,  independent  of  any  estimator.  The 
multiframe  derivation  of  this  shows  that  the  performance  bound  asymptotically 
approaches  zero  as  the  number  of  frames  used  in  the  estimation  increases.  This  boundary 
equips  the  analyst  with  a  helpful  tool  for  determining  how  many  frames  to  include  in  the 
estimation  process. 
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Potential  further  research  in  this  area  includes  modeling  with  aliasing, 
optimization  for  real  time  implementation,  characterization  of  shifting  without  a  zero 
mean,  and  performance  comparison  using  initial  estimators  other  than  the  cross 
correlator. 

This  thesis  offers  the  field  of  image  processing  a  new  approach  to  multiframe 
shift  estimation.  Theoretical  proof  and  experimental  results  show  that  this  new  approach 
is  valid  and  useful  for  data  analysis.  The  performance  boundary  derivations  in  this  thesis 
provide  a  frame  of  reference  for  all  multiframe  translational  shift  estimation  problems. 
Current  research  in  estimation  theory  for  the  field  of  image  processing  has  moved 
forward  as  a  result  of  this  thesis. 
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